Abstract

Bell's theorem is purported to demonstrate the impossibility of a local "hidden vari- 
able" theory underpinning quantum mechanics. It relies on the well-known assumption 
of 'locality', and also on a little-examined assumption called 'statistical independence' (SI). 
Violations of this assumption have variously been thought to suggest "backward causation",
a "conspiracy" on the part of nature, or the denial of "free will". It will be shown here that 
these are spurious worries, and that denial of SI simply implies nonlocal correlation between 
spacelike degrees of freedom. Lorentz-invariant theories in which SI does not hold are easily 
constructed: two are exhibited here. It is conjectured, on this basis, that quantum-mechanical 
phenomena may be modeled by a local theory after all. 

1 Introduction 

The violation of the Bell-CHSH inequality by quantum mechanics is commonly understood to
undermine the possibility of "local" hidden-variable theories, i.e., theories which either
supplement quantum mechanics with additional variables (e.g., actual particle positions in the
Bohm theory ([4], [5])), or replace the quantum-mechanical description with something else
entirely. This "no-go" result is known as Bell's theorem ([1]). The argument is essentially that
the assumption of a certain kind of locality - known as 'Bell locality', 'factorizability' ([12]),
'strong locality' ([15]), or simply 'locality' ([1]) - is a sufficient condition to derive the
inequality, and the predictions of quantum mechanics violate this inequality. (The locution
"strong locality" suggests that there are weaker forms of locality, and indeed it was shown by
Jarrett ([15], [16]) and by Shimony ([21]) that 'strong locality' is the conjunction of two weaker
locality conditions, referred to by Shimony as "outcome independence" and "parameter
independence".)

However, there is an additional, nontrivial assumption that goes into the Bell argument, and 
this is the assumption of statistical independence (SI). Its role in the derivation was understood 
by Bell and others, but it has been little examined because it has for the most part been thought 
to be beyond question, since violations of it have seemed to most to entail either violations of
"free will", the possibility of "backward causation", or some sort of cosmic conspiracy on the
part of nature. I will show that, to the contrary, violations of SI entail none of these, and I will
in fact offer in support of this contention two examples of classical, Lorentz-invariant theories 
that violate SI. 

*This paper is dedicated to the memory of John A. Wheeler.

The paper proceeds as follows:

• Section 2 rehearses the way in which the constraints of factorizability and statistical
independence come into the derivation of the Bell-CHSH inequality.

• Section 3 shows that SI entails that spacelike degrees of freedom are independent, and that 
violation of SI implies that spacelike degrees of freedom are not independent (rendering 
the term "degrees of freedom" something of a misnomer) . 

• In section 4, it is argued that violation of SI does not lead to problems with free-will, 
backward causation, etc. 

• In section 5, two examples of SI-violating theories are offered.

• In section 6, it is shown that in fact violation of SI leads naturally to the "contextuality" 
(of the values of various degrees of freedom) demanded by the Kochen-Specker theorem. 

• In section 7, we discuss future prospects for a theory with nonlocal constraints. 

• Section 8 comprises some concluding remarks. 

2 Bell's theorem 

The thought experiment at the core of the Einstein-Podolsky-Rosen (EPR) paper on the
incompleteness of quantum mechanics ([11]) involves a pair of particles in an entangled state of
position and momentum, a state which is an eigenstate of the quantum-mechanical operators
representing the sum of the momenta and the difference of the positions of the particles.
Quantum mechanics makes no definite predictions for the position and momentum of each particle,
but does make unequivocal predictions for the position or momentum of one, given (respectively)
the position or momentum of the other. EPR argued that this showed that quantum mechanics
must be incomplete, since measurement of the position (or momentum) of one particle could
not simultaneously give rise to a definite position (or momentum) of the other particle, on pain
of violation of locality. They concluded that quantum mechanics, because it did not assign a
position (or momentum) to the other particle beforehand, must be incomplete.^1

Bohm's streamlined version of the EPR experiment ([3]) involves the spins of a pair of particles
(either fermions or bosons) rather than their positions and momenta. For a pair prepared in what
has come to be known as a "Bell" state, quantum mechanics predicts that a measurement of the
component of spin of particle A in any direction (e.g., $z$) is as likely to yield $+1$ as $-1$
(in units of $\hbar/2$), and so the expectation value $\bar{A}$ is 0. However, quantum mechanics
also indicates that an outcome of $+1$ for a measurement of the spin of A in the $z$ direction
guarantees an outcome of $-1$ for a measurement of the spin of B in the $z$ direction, etc. This
is directly analogous to the correlations between position and momentum measurements in the
original EPR experiment.
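The two predictions just described can be checked numerically. The following is a minimal sketch, assuming that the Bell state in question is the spin singlet, $|\psi\rangle = (|{+}\rangle_A|{-}\rangle_B - |{-}\rangle_A|{+}\rangle_B)/\sqrt{2}$, and that spin is measured along axes in the $x$-$z$ plane; the helper names `spin`, `expectation` and `correlation` are illustrative choices and not part of the argument.

```python
import numpy as np

# Pauli matrices (spin observables in units of hbar/2)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def spin(theta):
    """Spin component along an axis in the x-z plane, at angle theta from z."""
    return np.cos(theta) * SZ + np.sin(theta) * SX

# Assumption: the Bell state is the singlet, written in the z basis.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def expectation(op):
    """Expectation value of a two-particle observable in the singlet state."""
    return float(np.real(np.vdot(singlet, op @ singlet)))

def correlation(theta_a, theta_b):
    """Expected product of the outcomes at A and B for the given axes."""
    return expectation(np.kron(spin(theta_a), spin(theta_b)))

# Each outcome on its own averages to zero, whatever the axis.
print(expectation(np.kron(spin(0.3), I2)))
# Equal axes: perfect anticorrelation.
print(correlation(0.0, 0.0))
# Different axes: the anticorrelation weakens as the angle grows.
print(correlation(0.0, np.pi / 3))
```

Under this assumption the correlation depends only on the angle between the two axes, and equals minus the cosine of that angle.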

In and of themselves, these phenomena offer no barrier to a hidden-variable theory, since it
is straightforward to explain such correlations by appealing to a common cause - the source -
and postulating that the particles emanate from this source in (anti)correlated pairs. However,
such an explanatory strategy must also account for the way that the anticorrelation drops off as
the angle between the components of spin for the two particles increases (e.g., as A rotates from
$x$ toward $z$ while B remains at $x$). It was Bell's great insight to note that the quantum theory
implies that the anticorrelation is held onto more tightly than could be accounted for by any
"local" theory. Bell showed that the predictions of a local theory must satisfy an inequality (a
precursor to the Bell-CHSH inequality below), and that this inequality is violated by quantum
theory for appropriate choices of the components of spin to be measured.

^1 The argument of the EPR paper is notoriously convoluted, but I follow ([13]) in regarding this
as capturing Einstein's understanding of the core argument.

In order to understand the role of the locality assumption and the statistical independence 
assumption, let us briefly review the derivation of the Bell-CHSH inequality. The physical 
situation we are attempting to describe has the following form: 





A source (represented by the ellipse) emits a pair of particles, or in some other way causes 
detectors A and B to simultaneously (in some frame) register one of two outcomes. The detectors 
can be set in one of two different ways, corresponding, in Bohm's version of the EPR experiment, 
to a measurement of one of two different components of spin. 

Let us now suppose that we have a theory that describes possible states of the particles
by a discrete or continuous parameter $\lambda$, describing either a discrete set of states
$\lambda_1, \lambda_2, \ldots$ or a continuous set. We will also suppose that the theory provides
us with predictions for the average values $\bar{A}(a, \lambda)$ and $\bar{B}(b, \lambda)$ of
measurements of properties $a$ and $b$ at detectors A and B in any given state $\lambda$. (The
appeal to average values allows for stochastic theories, in which a given $\lambda$ might give rise
to any number of different outcomes, with various probabilities.) In general, one might suppose
that $\bar{A}$ also depended on either the detector setting $b$ or the particular outcome $B$
(i.e., $\bar{A} = \bar{A}(a, \lambda, b, B)$), and similarly for $\bar{B}$. That it does not, that
the expectation value $\bar{A}$ in a given state $\lambda$ does not depend on what one chooses to
measure at B, or on the value of the distant outcome $B$ (and vice versa), is Bell's locality
assumption. Given this assumption, one can write the expression $E(a, b, \lambda)$ for the
expected product of the outcomes of measurements of properties $a$ and $b$ in a given state
$\lambda$ as

$$E(a, b, \lambda) = \bar{A}(a, \lambda)\,\bar{B}(b, \lambda). \qquad (2)$$

This condition is also known as 'factorizability', deriving as it does from the fact that the joint 
probability of a pair of outcomes can be factorized into the product of the marginal probabilities 
of each outcome. We can thus represent the analysis of the experimental arrangement in this 
way: 
[Figure: the experimental arrangement, annotated with $E(a_1, b_2, \lambda) = \bar{A}(a_1, \lambda)\,\bar{B}(b_2, \lambda)$]
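The link between factorized joint probabilities and the product form of the expectation value can be made concrete with a small sketch. It assumes outcomes $\pm 1$ and a single fixed state $\lambda$; the marginal probabilities 0.8 and 0.3 are hypothetical numbers chosen only for illustration.

```python
import itertools

def joint_factorized(p_a_plus, p_b_plus):
    """Joint distribution over outcome pairs (A, B) in one fixed state,
    assuming it factorizes into the product of the two marginals."""
    p_a = {+1: p_a_plus, -1: 1.0 - p_a_plus}
    p_b = {+1: p_b_plus, -1: 1.0 - p_b_plus}
    return {(x, y): p_a[x] * p_b[y]
            for x, y in itertools.product((+1, -1), repeat=2)}

def mean(p_plus):
    """Average outcome of a +/-1 valued measurement."""
    return p_plus * (+1) + (1.0 - p_plus) * (-1)

def expected_product(joint):
    """Expected product of the two outcomes under a joint distribution."""
    return sum(x * y * p for (x, y), p in joint.items())

# Hypothetical marginals P(A=+1 | a, lambda) and P(B=+1 | b, lambda).
p_a_plus, p_b_plus = 0.8, 0.3
joint = joint_factorized(p_a_plus, p_b_plus)

lhs = expected_product(joint)            # E(a, b, lambda)
rhs = mean(p_a_plus) * mean(p_b_plus)    # A-bar(a, lambda) * B-bar(b, lambda)
print(lhs, rhs)
```

The two printed numbers agree, which is just the statement that factorization of the joint probability yields the product form of the expectation value in each state.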

Now, a theory that accounts for our observations will presumably do so in part by giving a
probability distribution $P(\lambda)$ over the various possible states associated with a given
"preparation" (a given set of circumstances in the region of the ellipse in the diagram above),
and the expected outcome $E(a, b)$ will then be given by the weighted sum (we restrict to
discrete $\lambda$ for simplicity)

$$E(a, b) = \sum_{\lambda} E(a, b, \lambda)\,P(\lambda|a, b) = \sum_{\lambda} \bar{A}(a, \lambda)\,\bar{B}(b, \lambda)\,P(\lambda|a, b) \qquad (3)$$

where $P(\lambda|a, b)$ is the probability of $\lambda$ given detector settings $a$ and $b$. Thus
the expected value for the product of a measurement of spin components $a_1$ and $b_2$ is

$$E(a_1, b_2) = \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_2, \lambda)\,P(\lambda|a_1, b_2), \qquad (4)$$

the sum of the products of the expected values of the outcome at A, the outcome at B in each
state $\lambda$ ($\lambda_1$, $\lambda_2$, etc.), and the probability of that state. If the
probability of $\lambda$ is independent of the detector settings $a$ and $b$, then one can replace
$P(\lambda|a, b)$ with $P(\lambda)$. This is the condition known as Statistical Independence (SI),
and as we shall now see, it is crucial to the derivation of Bell's result.

The Bell-CHSH inequality ([6]) is:

$$|E(a_1, b_1) - E(a_1, b_2)| + |E(a_2, b_2) + E(a_2, b_1)| \le 2. \qquad (5)$$
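For orientation, the size of the quantum violation of (5) can be sketched numerically. This assumes the singlet-state prediction $E(a, b) = -\cos\theta_{ab}$, where $\theta_{ab}$ is the angle between the two measurement axes; the particular angles used are a conventional choice and are not taken from the text.

```python
import math

def E_singlet(theta_a, theta_b):
    """Assumed quantum prediction for the singlet state: minus the cosine
    of the angle between the two measurement axes."""
    return -math.cos(theta_a - theta_b)

def chsh(E, a1, a2, b1, b2):
    """Left-hand side of inequality (5) for a correlation function E."""
    return abs(E(a1, b1) - E(a1, b2)) + abs(E(a2, b2) + E(a2, b1))

# A conventional choice of coplanar measurement axes (angles in radians).
a1, a2 = 0.0, math.pi / 2
b1, b2 = math.pi / 4, 3 * math.pi / 4

S = chsh(E_singlet, a1, a2, b1, b2)
print(S, 2 * math.sqrt(2))
```

With these settings the left-hand side comes out at $2\sqrt{2}$, above the bound of 2.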

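The beginning of the derivation goes as follows. First, write down the difference between
expectation values for pairs of settings $(a_1, b_1)$ and $(a_1, b_2)$:

$$E(a_1, b_1) - E(a_1, b_2) = \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_1, \lambda)\,P(\lambda|a_1, b_1) - \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_2, \lambda)\,P(\lambda|a_1, b_2) \qquad (6)$$

Assuming that SI holds, we can rewrite this as

$$E(a_1, b_1) - E(a_1, b_2) = \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_1, \lambda)\,P(\lambda) - \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_2, \lambda)\,P(\lambda) \qquad (7)$$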

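The key step, which allows the introduction of $E(a_2, b_2)$ and $E(a_2, b_1)$, involves expanding
this as

$$\begin{aligned}
E(a_1, b_1) - E(a_1, b_2) = {} & \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_1, \lambda)\,P(\lambda)\,\bigl(1 \pm \bar{A}(a_2, \lambda)\,\bar{B}(b_2, \lambda)\bigr) \\
& - \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_2, \lambda)\,P(\lambda)\,\bigl(1 \pm \bar{A}(a_2, \lambda)\,\bar{B}(b_1, \lambda)\bigr).
\end{aligned} \qquad (8)$$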

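This then leads to (5) via rearrangement of terms and manipulation using the relations
$|x||y| = |xy|$ and $|x + y| \le |x| + |y|$, together with the bounds $|\bar{A}| \le 1$ and
$|\bar{B}| \le 1$. For our purposes, though, the crucial step is (7), in which essential use is
made of SI. If we were not to assume SI, then (7) would revert to (6) and we would have to
rewrite (8) as

$$\begin{aligned}
E(a_1, b_1) - E(a_1, b_2) = {} & \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_1, \lambda)\,P(\lambda|a_1, b_1)\,\bigl(1 \pm \bar{A}(a_2, \lambda)\,\bar{B}(b_2, \lambda)\bigr) \\
& - \sum_{\lambda} \bar{A}(a_1, \lambda)\,\bar{B}(b_2, \lambda)\,P(\lambda|a_1, b_2)\,\bigl(1 \pm \bar{A}(a_2, \lambda)\,\bar{B}(b_1, \lambda)\bigr)
\end{aligned} \qquad (9)$$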

which is simply invalid, since the new terms need not sum to zero anymore. Nor, for that matter,
would they correspond to the desired $E(a_2, b_2)$ and $E(a_2, b_1)$. Thus without appeal to SI,
there is no way to introduce the other two expectation values and derive the inequality.
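The role of SI in the derivation can be illustrated with a toy calculation. The sketch below is not a physical model, and every name and number in it is an assumption made for illustration. In the first case, randomly generated local averages $\bar{A}$, $\bar{B}$ are combined with a setting-independent $P(\lambda)$, and the left-hand side of (5) never exceeds 2. In the second case the outcomes remain strictly local (each depends only on $\lambda$), but $P(\lambda|a, b)$ is allowed to depend on the settings, and the bound is exceeded.

```python
import numpy as np

def chsh(E):
    """Left-hand side of (5); E[i, j] stands for E(a_{i+1}, b_{j+1})."""
    return abs(E[0, 0] - E[0, 1]) + abs(E[1, 1] + E[1, 0])

def correlations(A, B, P):
    """E(a_i, b_j) = sum over states l of A[i, l] * B[j, l] * P[i, j, l],
    where P[i, j, l] plays the role of P(lambda_l | a_i, b_j)."""
    return np.einsum("il,jl,ijl->ij", A, B, P)

rng = np.random.default_rng(0)
n_states = 8

# Case 1: SI holds, so P(lambda | a, b) = P(lambda) for every pair of settings.
worst = 0.0
for _ in range(2000):
    A = rng.uniform(-1, 1, size=(2, n_states))   # local averages at A
    B = rng.uniform(-1, 1, size=(2, n_states))   # local averages at B
    p = rng.dirichlet(np.ones(n_states))          # a random P(lambda)
    P = np.broadcast_to(p, (2, 2, n_states))
    worst = max(worst, chsh(correlations(A, B, P)))

# Case 2: SI fails. The state is lambda = (x, y) with x, y = +/-1, and the
# outcomes are simply A = x and B = y, whatever the settings.
angles_a = np.array([0.0, np.pi / 2])
angles_b = np.array([np.pi / 4, 3 * np.pi / 4])
states = [(x, y) for x in (+1, -1) for y in (+1, -1)]
A2 = np.array([[x for x, _ in states]] * 2, dtype=float)
B2 = np.array([[y for _, y in states]] * 2, dtype=float)
xy = np.array([x * y for x, y in states], dtype=float)

# Setting-dependent distribution chosen to give E(a, b) = -cos(a - b).
target = -np.cos(angles_a[:, None] - angles_b[None, :])
P2 = (1.0 + target[:, :, None] * xy[None, None, :]) / 4.0

S_no_si = chsh(correlations(A2, B2, P2))
print(worst, S_no_si)
```

The second case reaches $2\sqrt{2}$ without any dependence of the outcome at one detector on the setting or outcome at the other. The toy distribution is simply stipulated by hand, so it says nothing about whether such correlations could arise in a lawlike way; that question is taken up in the sections that follow.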
3 Statistical independence revisited

The assumption of SI has been called into question only infrequently, but when it has, the
critique has often been motivated by an appeal to the plausibility of Lorentz-invariant "backward
causation", whereby the change of detector settings gives rise to effects which propagate along
or within the backward lightcone and thereby give rise to nontrivial initial correlations in the
particle properties encoded in $\lambda$ (e.g., ([8]), ([24]), ([19])). In this section, I will
argue that this is an inappropriate way to motivate the rejection of SI, and that its rejection
instead involves a relativistically nonproblematic commitment to nonlocal constraints on initial
data.

Depicted in Figure 1 is a run in which the setting of A is changed from $a_1$ to $a_2$ while the
particles (or whatever it is that emanates from the source) are in flight.

The different colors subsequent to the arrival of the particles at the two detectors correspond 
to distinct experimental outcomes. 

Now a clear-thinking student of relativity should suspect that something is amiss with this 
argument, since all deterministic theories in use today already sanction a form of backward cau- 
sation, in that they allow both prediction and retrodiction. All special and general-relativistic 
theories have a well-defined causal structure which makes no distinction, other than a conven- 
tional one, between future and past. Specifying the physical properties (the Cauchy data) at 
each point on either of the shaded surfaces suffices to determine the physical situation at E 
(Figure 2). The future data determine the event E just as much as the past data. And given an
appropriate description of the future data (a description which adopts a "backward-directed"
temporal orientation), one can regard these data as the cause of the event E.^2

^2 For example, we seek the cause of an explosion in the past, but if the very same event were
described from an inverted temporal perspective, described as an implosion, we would look in the
opposite temporal direction.

Figure 2: Past and future domains of dependence

Let us put aside any qualms we might have regarding the notion of backward causation for
the moment and examine the particular situation of the EPR-Bohm experiment more closely, in
the hope that this will shed some light. Suppose we simply temporally invert the situation above,
as in Figure 3. This looks like a pair of sources, A and B, emitting particles in the direction
of a common destination. Is it not reasonable to expect that the "final" state
$\lambda = (\lambda_A, \lambda_B)$ is correlated with the settings of the sources?

In fact, it is not. Suppose we know some, but not all, of the data in the past lightcone of 
an event E. Suppose, for instance, that we know that the white region is empty of physically 
significant data, and suppose we know the data in the red and blue ellipses, but not in the 
turquoise ellipses indicated by question-marks: 



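[Figure: the past lightcone of the event E, with data unspecified in the regions marked by question marks]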




In such a case, we know nothing more about what to expect at E than if we had no information 
at all. Given the ability to choose data in the turquoise ellipses, we can make E whatever we
want. 

But now suppose we fill in the unspecified ellipses: 



[Figure: the past lightcone of the event E, with data specified in all regions]




Then we know data on a Cauchy slice of the past lightcone of E, and E is fully determined. 
It is this situation that is the appropriate parallel of the time-reversed EPR experiment; the 
newly-specified data correspond to the outcomes of the two trials. 

Any plausibility that the particles might be causally correlated with the detector settings 
derives from a situation in which the detectors themselves are the sources of the particles, rather 
than mere conduits. In the EPR case, there are meaningful, (anti)correlated detection events,
and in the time-reversed picture these detection events serve as additional data, additional 
sources. (One can move the slice so that it is prior to any outcome, but one will still have 
to contend with the fact that only a complete specification of the physics inside the detector, 
including the state of the particles, is sufficient to determine E.) 

The upshot, then, is that postulating a correlation between detector settings and the initial
state $\lambda$ (corresponding to E in the figure above) - i.e., dropping SI - amounts to
postulating a correlation between the detector settings and the particle properties at any given
time. That is, the correlations are not causal - they are not brought about dynamically - but are
properties of the state at any instant. The role of dynamics in a proper theory describing quantum
phenomena is to enforce, not generate, such correlations. The challenge, as yet unmet, is to
articulate the constraint which encodes these correlations. In section 5, we will examine two
theories with appropriately nonlocal constraints in order to develop intuition. But first, let us
examine two other problems which are purported to arise in the rejection of SI.

4 Superdeterminism and free will 

The idea that the rejection of SI involves not some dynamical process like "backward causation" 
but rather some preexisting and persisting correlations between subsystems has been broached 
before, under the terms 'conspiracy theory', 'hyperdeterminism', and 'superdeterminism'. Bell 
([2]), Shimony ([22]), Lewis ([18]) and others have suggested that proposing a correlation between
detector settings and particle properties involves some sort of conspiracy on the part of nature. 
This is frequently accompanied by the charge that the existence of such correlations is a threat 
to "free will". Let us address these worries. 

4.1 Conspiracy 

The idea that postulating a correlation between detector settings and particle properties involves 
a "conspiracy" on the part of nature appears to derive from the idea that it amounts to postu- 
lating that the initial conditions of nature have been set up in anticipation of our measurements. 
It might be supposed, analogously, that every time I telephone my friend Jenny at 867-5309, 
something - perhaps a cosmic ray - causes my message to be misdirected to the non-working 
number 867-5308, so that I appear to live in a Kafkaesque world in which my efforts to contact 
Jenny are forever stymied. This would appear to be a world in which nature, in the form of 
particularly vicious initial conditions, conspires against me to the point where I am driven to 
postulate that it is a law of nature that I cannot successfully contact Jenny (except, perhaps, 
on her mobile phone). But according to the way the story is told, it is really just an accident 
that I cannot successfully make contact. Similarly, the conspiracy theorist views the appeal to 
a failure of SI in order to explain the strange correlations predicted by quantum theory as an 
appeal to a vast conspiracy on the part of nature to set initial conditions in such a way as to 
ensure that experiments come out in accord with the quantum-mechanical predictions, so that 
every time I do an EPR experiment it just happens to be the case that the detectors are set in 
a way appropriate to generate the observed correlations. 

What the conspiracy theorist is in effect doing is supposing a non-lawlike suspension of SI. 
That is, she is supposing that the laws of nature are ordinary, local, relativistic laws, without 
any nonlocal constraints, but that the initial conditions are such that it happens to turn out 
that the states of measuring apparatuses are nontrivially, and persistently, correlated with the 
states of the particles they eventually interact with. The idea seems to be that, were the initial 
conditions to have been somewhat different, the entire quantum-mechanical edifice would fall 
apart. Certainly, this is a theoretical possibility, but not a very happy one, for two reasons. 
Were one to maintain that the laws of nature are the ordinary classical ones, with no general, 
nonlocal constraints, and that quantum mechanics is the result of a highly special set of initial 
conditions, one would be foregoing the possibility of explaining the myriad phenomena accounted 
for by quantum mechanics which have nothing to do with measurement, such as the stability 
of matter or the black-body emission spectrum. Although a theory that purports to account
for the full spectrum of quantum phenomena in a way that does not violate Bell's locality
assumptions must specify nontrivial correlations between spacelike degrees of freedom, it cannot
do only that. Rather, the constraints in an SI-violating theory must account for the full range
of phenomena accounted for by quantum mechanics and quantum field theory. Thus a truly
useful and predictive theory underpinning quantum phenomena is highly unlikely to have the
ad hoc character which concerns the conspiracy theorist.



4.2 Free will 

Another worry about giving up SI and postulating generic nonlocal, spacelike correlations has 
to do with a purported threat to our "free will". This particular concern has been the subject
of renewed debate in the last couple of years, prompted in part by an argument of Conway
& Kochen ([7]). The core of the worry is that if detector settings are correlated with particle
properties, this must mean that we cannot "freely choose" the detector settings. This worry,
however, appears to be based on a conception of free will which is incompatible with ordinary
determinism, as pointed out by 't Hooft ([25]). Why is it any more of a threat to free will to
have our "actions" correlated with other degrees of freedom than it is to have our actions be 
determined by the events in our past? (Conway and Kochen bite the bullet and argue that 
even ordinary determinism is incompatible with free will.) 

One might conceivably make the case that ordinary deterministic theories are fine in a way 
that superdeterministic theories, with their nonlocal constraints, are not by arguing that, in 
allowing that our actions are determined by the past, we are simply granting that our actions 
arise from our own thoughts and inclinations. This (limited) sort of determination is actually 
essential for free choice. This perspective is a version of what is called 'compatibilism' in 
philosophy, the view that freedom of the will is compatible with determinism ([14]).

A problem for free will would then arise if it were the case that the nonlocal correlations as- 
sociated with an underlying superdeterministic theory somehow prevented an agent from acting 
on its thoughts and inclinations. This is to say that the physical object identified as the "agent" 
would exhibit behavior not explicable in terms of the influences on or in its past lightcone. 
But this is not what is being contemplated here, for this would involve non-Lorentz-invariant 
dynamics. What is at issue are Lorentz-invariant theories with nonlocal constraints. 

5 Theories with nonlocal constraints: two examples 

Theories which have constraints on initial data can be divided into two kinds, local and nonlocal. 
The gauge field theories of the standard model are local, in that the constraint may be expressed 
as a local condition, for example $\nabla \cdot \mathbf{E} = 0$, the Gauss law (in vacuo). The locality of the
constraint means that specifying the field at every point outside of an open set surrounding a
point $x$ does not constrain the value of the field at $x$. Rather, the field in the neighborhood of
a point is constrained only by the field at the point. 

Theories with nonlocal constraints are less familiar. These are theories in which specifying 
the value of a field outside the neighborhood of a point x constrains the field at x. We will now 
consider two examples of such theories. 

5.1 Timelike Cauchy surfaces 

Consider the theory of the massless scalar field, given by the wave equation $\Box\phi = 0$. In
two space dimensions, this reads

$$\frac{\partial^2 \phi}{\partial t^2} - \frac{\partial^2 \phi}{\partial x_1^2} - \frac{\partial^2 \phi}{\partial x_2^2} = 0, \qquad (10)$$

where $\phi(x, t)$ is a twice-differentiable, real-valued field on spacetime. It is well-known that
the Cauchy problem is well-posed, meaning that specifying the field $\phi(x)$ and its normal
derivative $\partial\phi(x)/\partial t$ on a spacelike hyperplane of constant $t$ uniquely fixes
the field at all other times. More important, it is also the case that a solution exists for any
such data, meaning that the field and its time rate-of-change at each point are independently
specifiable. Thus the ordinary initial value formulation of the wave equation has no constraints,
either local or nonlocal.

On the other hand, one can also specify initial data on a mixed (spacelike and timelike)
hyperplane^3 ([17]). Given data $\phi(x_1, t)$ and $\partial\phi(x_1, t)/\partial x_2$ on the
hyperplane $x_2 = 0$, the data uniquely determine the solution, if a solution exists at all for
that data. The fact that solutions do not exist for arbitrary data (except in the case of one
space dimension) means that, as formulated, the initial value problem is not "well-posed."
However, it has recently been shown ([10]) that, just as in a gauge theory, one may write down a
constraint on the initial data such that any initial data satisfying the constraint lead to a
unique, stable solution of the equation. The resulting problem is well-posed.

The difference between this constraint and the gauge-theoretic constraint is that the former
is nonlocal while the latter is local. Specifically, the Cauchy data are given by functions $f$
and $g$ such that

$$\begin{aligned}
f(x_1, t) &:= \phi(x_1, 0, t) = \int \hat{f}(k_1, \omega)\, e^{i(k_1 x_1 - \omega t)}\, dk_1\, d\omega \\
g(x_1, t) &:= \frac{\partial \phi}{\partial x_2}(x_1, 0, t) = \int \hat{g}(k_1, \omega)\, e^{i(k_1 x_1 - \omega t)}\, dk_1\, d\omega
\end{aligned} \qquad (11)$$

where $\hat{f}$ and $\hat{g}$ are smooth functions of $k_1$ and $\omega$, related to $f$ and $g$
by the Fourier transform. The functions $f$ and $g$ therefore cannot have compact support (though
they may be chosen so as to have arbitrarily small tails outside of a finite region).

The upshot of this example is that the ordinary theory of the massless scalar field, formulated
in terms of states specified on mixed spacelike and timelike hypersurfaces, is one in which a
natural generalization of SI to the case of fields (rather than particle states and detector
settings) is violated. That is, the natural analogue of $P(\lambda|a, b) = P(\lambda)$ does not
hold. For example, consider disjoint compact regions $A$, $B$ and $\Lambda$ on the initial data
surface. Let $\lambda = (f(\Lambda), g(\Lambda))$ represent the state of the field in $\Lambda$,
and let $a = (f(A), g(A))$ and $b = (f(B), g(B))$ represent the detector settings $a$ and $b$.
Then it is the case that, given a generic probability distribution on the space of initial data
$f$ and $g$, the probability of $\lambda$ (the restriction of $f$ and $g$ to region $\Lambda$)
will not be independent of $a$ and $b$. For example, if the functions $f$ and $g$ vanish in the
regions $A$ and $B$, then it must be the case that they vanish in $\Lambda$ (otherwise $\Lambda$
would be a region in which $f$ and $g$ have compact support, which we know not to be the case).

Note that, despite the failure of SI in this context, we have a perfectly well-posed initial
value problem, and we even have compact domains of dependence (see Figure 4). Here, data in
R determine data in the region out to E. The nonlocal constraint simply means that data in
R may not be specified freely.

^3 These mixed hypersurfaces are sometimes called "timelike" ([17]) or "non-spacelike".

Figure 4: Compact timelike domain of dependence

5.2 Timelike compactification

Consider once more the wave equation, this time in one space dimension (for simplicity). And
consider again the initial value problem, but on an ordinary spacelike hyperplane. However,
suppose that the spacetime on which the field takes values is compactified in the time direction,
so that the entirety forms a cylinder (see Figure 5). This, too, is an example of a theory whose
initial value formulation possesses a nonlocal constraint.

The reason for the constraint is of course that the solutions must be periodic. Whereas in
the ordinary initial value problem, initial data may be any smooth functions $f(x) = \phi(x, 0)$
and $g(x) = \phi_t(x, 0)$, we now require that $\phi(x, 0) = \phi(x, T)$ and
$\phi_t(x, 0) = \phi_t(x, T)$, where $T$ is the circumference of the cylinder. Solutions to the
wave equation can be written as sums of plane waves, with Fourier space representation

$$\hat{\phi}(k, t) = F(k)\, e^{-ikt} + G(k)\, e^{ikt}. \qquad (12)$$

Since these plane waves must have period $T$ (in the preferred frame dictated by the cylinder),
we have a constraint $k = 2\pi n/T$ (where $n$ is a positive or negative integer), so that
initial data are no longer arbitrary smooth functions of $k$

$$\begin{aligned}
\hat{\phi}(k, 0) &= F(k) + G(k) \\
\hat{\phi}_t(k, 0) &= -ik\,\bigl(F(k) - G(k)\bigr)
\end{aligned}$$

but are rather constrained by the requirement $k = 2\pi n/T$. Thus the initial data are the
functions

$$\begin{aligned}
\phi(x, 0) &= \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} \hat{\phi}(2\pi n/T, 0)\, e^{i 2\pi n x/T}\, \Delta k \\
\phi_t(x, 0) &= \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} \hat{\phi}_t(2\pi n/T, 0)\, e^{i 2\pi n x/T}\, \Delta k
\end{aligned}$$

i.e., they consist of arbitrary sums of plane waves with wave number $k = 2\pi n/T$.

Figure 5: Timelike compactification

The restriction to a discrete (though infinite) set of plane waves means that initial data do 
not have compact support; they are periodic in both space and time. Thus, as in the case of the 
mixed initial value problem, the data cannot be specified freely. However, for sufficiently large T 
or sufficiently small $\Delta x$, the local physics is indistinguishable from the local physics
in ordinary Minkowski spacetime. Only at distance scales on the order of $T$ does the compact
nature of the time direction become evident in the repetition of the spatial structure.
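The periodicity claim can be checked with a short numerical sketch. It assumes units in which the wave speed is 1 and takes the allowed wave numbers to be $k = 2\pi n/T$; the value of $T$, the truncation to a finite set of modes, and the random coefficients are all illustrative choices.

```python
import numpy as np

T = 10.0                      # circumference of the time direction (illustrative)
rng = np.random.default_rng(1)
n = np.arange(-5, 6)          # finite truncation of the allowed modes
k = 2 * np.pi * n / T         # assumed constraint on the wave number (c = 1)

# Arbitrary complex amplitudes for right- and left-moving waves.
F = rng.normal(size=n.size) + 1j * rng.normal(size=n.size)
G = rng.normal(size=n.size) + 1j * rng.normal(size=n.size)

def phi(x, t):
    """Real field built from plane waves with the allowed wave numbers:
    the real part of sum_n F_n exp(i k_n (x - t)) + G_n exp(i k_n (x + t))."""
    x = np.asarray(x, dtype=float)[..., None]
    waves = F * np.exp(1j * k * (x - t)) + G * np.exp(1j * k * (x + t))
    return np.real(np.sum(waves, axis=-1))

x = np.linspace(0.0, 3.0, 7)
t = 1.234

periodic_in_time = np.allclose(phi(x, t + T), phi(x, t))
periodic_in_space = np.allclose(phi(x + T, t), phi(x, t))
print(periodic_in_time, periodic_in_space)
```

Because every allowed mode repeats after a time $T$ and after a distance $T$, the values of the field on one stretch of the initial surface fix its values on every translate of that stretch by $T$. This is the sense in which the data cannot be specified freely.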

6 Contextuality 

The Kochen-Specker theorem points toward a kind of contextuality in quantum mechanics, and 
indeed in any theory in which the properties of a system are understood to be independent of 
the properties of other systems. The theorem shows that, for systems described by quantum 
mechanics, the properties of these systems cannot consistently be assigned values if the values 
respect a certain seemingly natural criterion called 'functional composition' ([20]). Functional
composition is the assumption that the value $v(AB)$ of an observable $AB$ which is the product
of commuting observables $A$ and $B$ is equal to the product $v(A)v(B)$ of the values of each
observable. Given this assumption, one can show that the following set of operators, representing
spin observables for a system composed of two spin-1/2 particles, cannot simultaneously be
assigned values in a way that is consistent with the requirement that the values belong to the
eigenvalue spectrum of the operators:

$$\begin{array}{ccc}
I \otimes \sigma_z & \sigma_z \otimes I & \sigma_z \otimes \sigma_z \\
\sigma_x \otimes I & I \otimes \sigma_x & \sigma_x \otimes \sigma_x \\
\sigma_x \otimes \sigma_z & \sigma_z \otimes \sigma_x & \sigma_y \otimes \sigma_y
\end{array} \qquad (13)$$

Rather, the value assigned to a given observable must depend on whether it is being measured 
along with the other (commuting) observables in its row, or the other observables in its column. 
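The impossibility of a context-independent assignment can be verified by brute force. The sketch below builds the nine operators of (13) as $4 \times 4$ matrices, computes the product of the three operators in each row and each column, and then searches all $2^9$ assignments of $\pm 1$ for one that respects functional composition in every row and column. The variable and function names are illustrative.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I4 = np.eye(4, dtype=complex)

# The square of observables in (13), row by row.
square = [
    [np.kron(I2, SZ), np.kron(SZ, I2), np.kron(SZ, SZ)],
    [np.kron(SX, I2), np.kron(I2, SX), np.kron(SX, SX)],
    [np.kron(SX, SZ), np.kron(SZ, SX), np.kron(SY, SY)],
]

def sign_of_product(ops):
    """Return +1 or -1 according to whether the product of the three
    operators is plus or minus the identity."""
    prod = ops[0] @ ops[1] @ ops[2]
    for s in (+1, -1):
        if np.allclose(prod, s * I4):
            return s
    raise ValueError("product is not plus or minus the identity")

row_signs = [sign_of_product(row) for row in square]
col_signs = [sign_of_product([square[r][c] for r in range(3)]) for c in range(3)]

def consistent(v):
    """True if the 3x3 grid v of +/-1 values multiplies to the required
    sign along every row and every column."""
    rows_ok = all(v[r][0] * v[r][1] * v[r][2] == row_signs[r] for r in range(3))
    cols_ok = all(v[0][c] * v[1][c] * v[2][c] == col_signs[c] for c in range(3))
    return rows_ok and cols_ok

n_consistent = sum(
    consistent([vals[0:3], vals[3:6], vals[6:9]])
    for vals in itertools.product((+1, -1), repeat=9)
)
print(row_signs, col_signs, n_consistent)
```

Every row multiplies to $+I$, while the columns multiply to $+I$, $+I$ and $-I$. The product of all nine values would therefore have to be $+1$ when taken by rows and $-1$ when taken by columns, so no assignment survives the search.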

Recent work on generalizing the Kochen-Specker result to any theory admitting an operational
characterization shows that there is a sense in which any theory that reproduces the
predictions of quantum mechanics must be contextual ([23]). Without going into unnecessary
detail, the general idea is that any theory reproducing the predictions of quantum theory must
be such that the probabilities for various outcomes must in general depend on which other
properties are (simultaneously) measured. Such a result, however, is utterly unsurprising in a
theory with nonlocal constraints, so long as one recognizes that the detector orientations
themselves are part of the system, since the nonlocal constraint means that the degrees of
freedom of the
detectors are not independent of those describing the particles. Indeed, from a non-operational, 
closed-system point of view, one may view the contextuality implicit in the Kochen-Specker 
theorem as implying the existence of a nonlocal constraint. This sheds light on the relationship 
between Bell's theorem and the Kochen-Specker theorem, in that K-S essentially shows that 
any local hidden-variable theory must violate statistical independence, while Bell shows that 
any statistically independent theory must violate locality. 

7 Future directions 

Neither of the two examples above, examples of theories with nonlocal constraints, appears to
have any direct connection to quantum mechanics, though the mixed initial value problem might
be so related. It is certainly worth investigating what sort of theory emerges if one takes, for
example, the wave equation on three space and two time dimensions (called an 'ultrahyperbolic'
equation) and considers data on an initial Cauchy surface of 3+1 dimensions. Such a theory will
also have nonlocal constraints, and might give rise to interesting behavior when the extra time
dimension is averaged over in such a way as to generate an effective theory in 3+1 spacetime
dimensions. The obvious difficulty is that, if the extra time dimension is not compact, there may
be no obvious choice of measure over which to average.

Perhaps more intriguingly, one might ponder the way in which the ordinarily superfluous 
gauge degrees of freedom of modern gauge theories might serve as nonlocal hidden variables. 
The vector potential in electrodynamics, for example, ordinarily plays no direct physical role: 
only derivatives of the vector potential, which give rise to the electric and magnetic fields, 
correspond to physical "degrees of freedom" in classical and quantum electrodynamics. The 
Aharonov-Bohm effect shows that the vector potential does play an essential role in the quantum 
theory, but the effect is still gauge-invariant.^4 One might nevertheless conjecture that there is
an underlying theory in which the potential does play a physical role, one in which the physics
is not invariant under gauge transformations.^5 The indeterminacy we associate with quantum
theory may then arise via epistemic limitations. More specifically, it may be impossible for us 
to directly observe the vector potential, and the uncertainties associated with quantum theory 
may arise from our ignorance as to its actual (and nonlocally constrained) value. From this 
perspective, quantum theory would be an effective theory which arises from modding out over 
the gauge transformations. 

Finally, recent work on decoherence and the emergence of classicality ([26]) suggests that the
emergence of classicality requires very special quantum states. For worlds with a large number 
of subsystems, hence a high Hilbert space dimension, only a measure zero subset of the total set 
of quantum states gives rise to distinctively classical behavior. Thus it seems quite reasonable 
to suppose a similarly strong constraint on the states of a hidden-variable theory. 

^4 One might also ponder the connection with the closely related way in which energy enters into
classical, nongravitational physics. In the absence of gravity, only differences in energy are
held to be observable, but when gravity enters the picture, absolute values of energy are
understood to be relevant. Of course, this in turn leads to the cosmological constant problem
when one attempts to couple a quantum theory of matter to classical gravitation.

^5 't Hooft has also gestured in this direction; see ([25]), p. 7.
8 Conclusion



The ideas sketched in the previous section are preliminary, of course, and they are only two of 
many possible ways to construct theories which feature nonlocal constraints. What the reader 
should take away from this paper, if nothing else, is the idea that it is not all that difficult to 
construct nonlocal theories which are nevertheless local in the sense of being Lorentz-invariant and
not allowing superluminal signaling, and that such theories are quite promising as deterministic 
or stochastic models of many of the curious phenomena described by quantum mechanics. 